Jiaxin Mao

Tournament-GRPO: Group-Wise Tournament Rewards for Reinforcement Learning in Open-Ended Long-Form Generation

May 26, 2026
Via arXiv

UnityMAS-O: A General RL Optimization Framework for LLM-Based Multi-Agent Systems

May 26, 2026
Via arXiv

PRAISE: Prefix-Based Rollout Reuse in Agentic Search Training

Apr 04, 2026
Via arXiv

DiffuRank: Effective Document Reranking with Diffusion Language Models

Feb 13, 2026
Via arXiv

Self-Compression of Chain-of-Thought via Multi-Agent Reinforcement Learning

Jan 29, 2026
Via arXiv

JADE: Bridging the Strategic-Operational Gap in Dynamic Agentic RAG

Jan 29, 2026
Via arXiv

MergeMix: Optimizing Mid-Training Data Mixtures via Learnable Model Merging

Jan 25, 2026
Via arXiv

Beyond Monolithic Architectures: A Multi-Agent Search and Knowledge Optimization Framework for Agentic Search

Jan 08, 2026
Via arXiv

Entropy-Guided Token Dropout: Training Autoregressive Language Models with Limited Domain Data

Dec 29, 2025
Via arXiv

$\text{E}^2\text{Rank}$: Your Text Embedding can Also be an Effective and Efficient Listwise Reranker

Oct 26, 2025
Via arXiv